leventov
Summary

Don't hold the lock in PoolByteStream.close() unless necessary.

This lock causes significant contention and slowdown in LiteLLM when it streams responses over the sync HTTP transport.
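To illustrate the general idea (not the actual diff in this PR), here is a minimal sketch of a stream whose close() does the potentially slow work outside the shared lock and takes the lock only to update pool bookkeeping. SketchPool, SketchByteStream, and raw_close are hypothetical names made up for this example and do not reflect the library's real internals. The sketch also assumes each stream is closed from a single thread.

```python
import threading


class SketchPool:
    """Hypothetical stand-in for a connection pool (not the real class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def register(self, stream):
        with self._lock:
            self._active.add(stream)

    def release(self, stream):
        # Only the shared-state mutation happens under the lock.
        with self._lock:
            self._active.discard(stream)

    def active_count(self):
        with self._lock:
            return len(self._active)


class SketchByteStream:
    """Hypothetical response stream that belongs to a SketchPool."""

    def __init__(self, pool, raw_close):
        self._pool = pool
        self._raw_close = raw_close  # assumed callable that closes the underlying stream
        self._closed = False
        pool.register(self)

    def close(self):
        # Assumption: close() is not called concurrently on the same stream.
        if self._closed:
            return
        self._closed = True
        try:
            # Potentially slow work runs without holding the pool lock,
            # so other threads are not blocked behind it.
            self._raw_close()
        finally:
            # The lock is taken only for the brief bookkeeping update.
            self._pool.release(self)
```

With many threads closing streams at once, the time each thread spends inside the lock is limited to the set update, which is the kind of reduction in contention described above.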

Testing: this patch was validated through load testing, which showed decreased lock contention (confirmed with py-spy) and a significantly lower wall-clock completion time for concurrent streaming LLM requests via LiteLLM.

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
